Using Agents to Extend Capabilities of Large Language Models

The power of Large Language Models (LLMs) today lies in their ability to process vast amounts of data and generate human-like text, enabling them to perform a wide range of tasks, including but not limited to language translation, text generation and information retrieval. These models can understand context, generate coherent responses, and even perform reasoning tasks to some extent.

Limitations of Large Language Models

Today, LLMs on their own, powerful as they are, face challenges due to their limited ability to interact with the outside world. Some key limitations include:

  • Real-World Interaction - LLMs may lack the ability to directly perceive and influence the real world. They are confined to processing information based on their training data and cannot interact with external systems or data.
  • Static Knowledge - LLMs are like libraries with fixed training data. They may not be able to continuously acquire new information, which can lead to outdated or incomplete responses as real-world knowledge evolves.
  • Reason and Act - LLMs can generate text but may struggle with reasoning and decision-making in complex scenarios. They lack the ability to select the right tools or take actions based on external inputs.
  • Dynamic Data Access - LLMs may not access dynamic or real-time information, limiting their ability to provide up-to-date responses or perform tasks that require current data.
  • Dependency on Training Data - LLMs are limited by what they have learned during training. They may not adapt well to new situations or tasks that go beyond their initial training data.
One approach to overcoming these limitations is to use agents.

Extend LLM with Agents

Agents can play a crucial role in enhancing LLMs. For example, tools such as Data Stores, Extensions, and Functions can interact with the underlying model and provide agents with access to real-time information, external systems, and specialised capabilities, thereby enhancing the performance and reliability of LLMs.

The following describes the concepts of agents and extensions, and their respective roles:

Agents

  • are applications that observe the environment, act upon it, and can achieve goals independently. They can reason, plan, and execute tasks based on the tools available to them.
  • can be proactive in reaching their goals and can make decisions based on the information they have.
  • can manage session history, allowing for multi-turn interactions with users. They can use cognitive architectures such as ReAct, Chain-of-Thought, or Tree-of-Thoughts to guide their reasoning and decision-making processes (a minimal reasoning-loop sketch follows this list).
  • may improve their performance over time through feedback and experience.
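
To make the reasoning-and-acting loop concrete, the sketch below outlines a minimal ReAct-style agent in Python. It is illustrative only: the call_llm function, the tool registry and the prompt format are hypothetical placeholders and do not belong to any particular framework.

    # Minimal ReAct-style loop: the model alternates between reasoning ("Thought"),
    # choosing a tool ("Action"), and reading the tool's result ("Observation").
    # call_llm() and the tool registry are hypothetical placeholders, not a real API.

    def call_llm(prompt: str) -> str:
        """Placeholder for a call to any LLM API; returns the model's next step."""
        raise NotImplementedError

    TOOLS = {
        "search": lambda query: f"(search results for '{query}')",   # stub tool
    }

    def react_agent(question: str, max_steps: int = 5) -> str:
        history = [f"Question: {question}"]      # session history supports multi-turn reasoning
        for _ in range(max_steps):
            step = call_llm("\n".join(history))  # e.g. "Thought: ... Action: search[llm agents]"
            history.append(step)
            if step.startswith("Final Answer:"):
                return step[len("Final Answer:"):].strip()
            if "Action:" in step:
                name, _, arg = step.split("Action:", 1)[1].strip().partition("[")
                observation = TOOLS[name.strip()](arg.rstrip("]"))
                history.append(f"Observation: {observation}")  # feed the tool output back
        return "No answer found within the step budget."

Cognitive architectures such as Chain-of-Thought or Tree-of-Thoughts change what the model is asked to produce at each step, but the surrounding loop of planning, acting and observing stays broadly the same.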

Extensions

  • can bridge the gap between an agent and external APIs in a standardised way. They allow agents to seamlessly execute APIs regardless of their underlying implementation.
  • may provide examples and parameters to teach agents how to use API endpoints effectively. They enable agents to dynamically select the most appropriate extension for a given task based on the provided examples.
  • enhance the agent's ability to interact with external systems and access real-time information, expanding the range of tasks the agent can perform (a minimal sketch of this idea follows the list).
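
As a rough illustration, the sketch below shows one way an extension could wrap an external API behind a standardised interface, carrying a description and example parameters that the agent can inspect when deciding which extension to use. The Extension and WeatherExtension classes, their fields and the endpoint URL are hypothetical and are not taken from any specific product.

    # Hypothetical sketch of an "extension": a uniform wrapper around an external API,
    # with a description and worked examples that help an agent choose and call it.
    import json
    import urllib.parse
    import urllib.request
    from dataclasses import dataclass, field

    @dataclass
    class Extension:
        name: str
        description: str                                   # what the API does, in plain language
        examples: list = field(default_factory=list)       # sample parameter sets the agent can learn from

        def execute(self, **params) -> dict:
            raise NotImplementedError

    @dataclass
    class WeatherExtension(Extension):
        endpoint: str = "https://example.com/api/weather"  # placeholder URL, not a real service

        def execute(self, **params) -> dict:
            # Standardised call: the agent only supplies parameters; the extension
            # hides the HTTP details of the underlying API.
            url = f"{self.endpoint}?{urllib.parse.urlencode(params)}"
            with urllib.request.urlopen(url) as response:
                return json.load(response)

    weather = WeatherExtension(
        name="get_weather",
        description="Returns the current weather for a city.",
        examples=[{"city": "Singapore"}],
    )

An agent holding a list of such objects can match a user's request against each name, description and example set, pick the most suitable extension, and call execute() without knowing anything about the API behind it.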

Thus, an “LLM Powered Agent” can consist of an LLM serving as the agent’s brain, together with key components for planning, memory and tool use. In more complex systems, multi-agent architectures may also be considered.

Realising Agentic Behaviour

To realise agentic behaviour, popular LLM frameworks such as LangChain or LlamaIndex can be used. Some practical use cases that can deliver immediate value include Agentic RAG, report generation, customer support and SQL Agents.
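
For instance, a small tool-using agent can be assembled in a few lines with LangChain. The sketch below follows the style of LangChain's v0.1 agents API (create_react_agent with a ReAct prompt pulled from the LangChain Hub); exact imports and signatures vary between versions, and the get_word_length tool, model name and question are illustrative only.

    # Sketch of a ReAct agent built with LangChain's v0.1-style agents API.
    # Assumes the langchain, langchain-openai and langchainhub packages are installed
    # and an OpenAI API key is configured; adapt imports to the framework version in use.
    from langchain import hub
    from langchain.agents import AgentExecutor, create_react_agent
    from langchain_core.tools import tool
    from langchain_openai import ChatOpenAI

    @tool
    def get_word_length(word: str) -> int:
        """Return the number of characters in a word."""
        return len(word)

    llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)
    tools = [get_word_length]
    prompt = hub.pull("hwchase17/react")            # a standard ReAct prompt template

    agent = create_react_agent(llm, tools, prompt)  # the LLM plans; tools act
    executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

    result = executor.invoke({"input": "How many letters are in the word 'agentic'?"})
    print(result["output"])

LlamaIndex offers comparable abstractions, such as agents that wrap query engines as tools, which is one common way to build agentic RAG pipelines.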

Cybersecurity Implications

The integration of LLM agents into cybersecurity is a double-edged sword, offering powerful tools to both attackers and defenders. On the one hand, LLM agents can support automated threat detection and response through threat identification, alert prioritisation, context-driven response generation, security policy enforcement, and threat handling. On the other hand, LLM agents may also pose threats to cybersecurity: for example, teams of LLM agents can be used to exploit real-world zero-day vulnerabilities.
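
As a simple defender-side illustration, an LLM agent could be asked to prioritise incoming alerts before a human analyst reviews them. The sketch below is hypothetical: call_llm stands in for any LLM API, and the prompt, fields and severity labels are illustrative rather than a recommended security workflow.

    # Hypothetical sketch: using an LLM to triage security alerts by severity.
    # call_llm() is a placeholder for any LLM API; its output should always be
    # reviewed by a human analyst before any action is taken.
    def call_llm(prompt: str) -> str:
        raise NotImplementedError

    def prioritise_alert(alert: dict) -> str:
        prompt = (
            "You are a security triage assistant. Classify the alert below as "
            "CRITICAL, HIGH, MEDIUM or LOW and give a one-line justification.\n"
            f"Source: {alert['source']}\n"
            f"Rule: {alert['rule']}\n"
            f"Details: {alert['details']}\n"
        )
        return call_llm(prompt)

    example = {
        "source": "ids-sensor-03",
        "rule": "Multiple failed SSH logins followed by success",
        "details": "42 failed attempts from one IP address, then a successful root login.",
    }
    # print(prioritise_alert(example))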

Summary

By integrating reasoning, logic, and access to external information, LLM agents can make better decisions, engage with external systems, and produce responses or actions based on real-time data. This allows agents to manage complex tasks autonomously, extend the capabilities of language models, and deliver more accurate and reliable results to users.

Used ethically, LLM agents can bring substantial benefit. When developed and deployed responsibly, they have the potential to revolutionise various fields and improve our lives in countless ways, including but not limited to education, healthcare, environmental science, and social good.

References

  1. Wiesinger, J., Marlow, P. and Vuskovic, V., ‘Agents’, Google whitepaper, https://www.kaggle.com/whitepaper-agents
  2. Yao, S. et al., 2023, ‘ReAct: Synergizing Reasoning and Acting in Large Language Models’, https://doi.org/10.48550/arXiv.2210.03629
  3. Wei, J. et al., 2023, ‘Chain-of-Thought Prompting Elicits Reasoning in Large Language Models’, https://doi.org/10.48550/arXiv.2201.11903
  4. NVIDIA Developer Blog, ‘Introduction to LLM Agents’, https://developer.nvidia.com/blog/introduction-to-llm-agents/
  5. LangChain, ‘Agents’, https://python.langchain.com/v0.1/docs/modules/agents/
  6. LlamaIndex, ‘Agents’, https://docs.llamaindex.ai/en/stable/use_cases/agents/
  7. Molleti, R. et al., 2024, ‘Automated threat detection and response using LLM agents’, https://doi.org/10.30574/wjarr.2024.24.2.3329
  8. Fang, R. et al., 2024, ‘Teams of LLM Agents can Exploit Zero-Day Vulnerabilities’, https://arxiv.org/pdf/2406.01637


Author

Contact Information: Leong Siang Huei (Dr)
School of Information Technology
Nanyang Polytechnic
E-mail: [email protected]

Dr Leong Siang Huei is a Senior Lecturer at Nanyang Polytechnic’s School of Information Technology. Prior to academia, he worked in R&D at both government and private organisations, in the areas of Artificial Intelligence, Machine Learning, Medical Robotics, Instrumentation, Building Automation and IoT. He has authored or co-authored more than 40 research papers in peer-reviewed journals and holds 14 granted US patents.